Abstract

Rotation distance between rooted binary trees measures the number of
simple operations it takes to transform one tree into another. There are
no known polynomial-time algorithms for computing rotation distance. We
give an efficient, linear-time approximation algorithm, which estimates
the rotation distance, within a provable factor of 2, between ordered
rooted binary trees.

1 Introduction

Binary search trees are a fundamental data structure for storing and
retrieving information [3]. Roughly, a binary search tree is a rooted binary
tree where the nodes are ordered "left to right." The potential efficiency of
storing and retrieving information in binary search trees depends on their
height and balance. Rotations provide a simple mechanism for "balancing"
binary search trees while preserving their underlying order (see Figure 1).

Figure 1: A (right) rotation at a node consists of rotating the right child of the left
child of the node to the right child of the node. A left rotation is defined similarly
by moving the left child of the right child of the node to the left child of the node.
The circled node in the middle tree has been rotated right to yield the tree on the
right, and similarly rotated left to yield the tree on the left.

There has been a great deal of work on estimating, bounding and computing
rotation distances. By rotating to right caterpillar trees, Culik and Wood
[5] gave an immediate upper bound of 2n - 2 for the distance between two
trees with n interior nodes. In elegant work using methods of hyperbolic
volume, Sleator, Tarjan, and Thurston [12] showed not only that 2n - 6 is
an upper bound for n > 11, but also that this bound is attained for all
sufficiently large n. In remarkable recent work, Dehornoy [7] gave concrete
examples illustrating that the lower bound is at least 2n - O(√n) for all n.
There are no known polynomial-time algorithms for computing rotation
distance, though there are polynomial-time estimation algorithms of Pallo [10],
Pallo and Baril [1], and Rogers [11]. Baril and Pallo [1] use computational
experimental evidence to show that a large fraction of their estimates are
within a factor of 2 of the rotation distance. The problem has recently been
shown to be fixed-parameter tractable in the parameter k, the distance [3].
Li and Zhang [9] give a polynomial-time approximation algorithm for the
equivalent diagonal flip distance with an approximation ratio of almost 1.97
(see footnote 1).

In this short note, we give a linear-time approximation algorithm with an
approximation ratio of 2, improving the running time at the very modest
expense of approximation ratio. This is accomplished by showing that the
distance between the trees is bounded below by n - e - 1 and above by
2(n - e - 1), where n is the number of internal nodes and e is the number of
edges in common in the reduced trees. The number of common edges is
equivalent to the Robinson-Foulds distance, widely used in phylogenetic
settings, which Day [6] calculates in linear time.

1 The exact ratio is bounded by the maximum number of diagonals, d, allowed at any
vertex, and is 2 - 2/(4(d - 1)(d + 6) + 1).
2 Background

We consider ordered, rooted binary trees with n interior nodes and where
each interior node has two children. Such trees are commonly called extended
binary trees [8]. In the following, tree refers to such a tree with an ordering
on the leaves, node refers to an interior node, and leaf refers to a
non-interior node. Our trees will have n + 1 leaves numbered in left-to-right
order from 1 to n + 1. The size of a tree will be the number of internal nodes
it contains. Each internal edge in a tree separates the leaves into two
connected sets upon removal, and a pair of edges e_1 in S and e_2 in T form a
common edge pair if their removal in their respective trees gives the same
partitions on the leaves. In that case, we say that S and T have a common edge.
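
As an illustration only, common edges can be counted by identifying each internal edge with the interval of leaf labels in the subtree hanging below it. The sketch below is a minimal hash-set version under stated assumptions, not the table-based method of Day [6]: the tuple encoding (a leaf is None, a node is a pair (left, right)) and the names edge_intervals and common_edges are hypothetical choices made for this example. Recording the interval below an edge treats the root as an extra boundary point, so the two edges at the root stay distinct.

```python
# Hypothetical encoding for this sketch: a leaf is None and an internal
# node is a pair (left, right). Leaves are labelled 1..n+1 left to right.

def edge_intervals(tree):
    """Return the set of (first_leaf, last_leaf) intervals, one per internal edge.

    Each internal edge joins a non-root internal node to its parent, so it is
    recorded as the interval of leaf labels below that node.
    """
    intervals = set()
    leaves_seen = 0

    def visit(node, is_root):
        nonlocal leaves_seen
        if node is None:                      # a leaf takes the next label
            leaves_seen += 1
            return leaves_seen, leaves_seen
        first, _ = visit(node[0], False)      # leftmost leaf below node
        _, last = visit(node[1], False)       # rightmost leaf below node
        if not is_root:                       # the root has no parent edge
            intervals.add((first, last))
        return first, last

    visit(tree, True)
    return intervals


def common_edges(s, t):
    """Number e of internal edges inducing the same leaf partition in s and t."""
    return len(edge_intervals(s) & edge_intervals(t))


S = ((None, None), (None, None))        # n = 3, leaves 1..4
T = (None, (None, (None, None)))        # right caterpillar, n = 3
print(sorted(edge_intervals(S)))        # [(1, 2), (3, 4)]
print(sorted(edge_intervals(T)))        # [(2, 4), (3, 4)]
print(common_edges(S, T))               # 1
```

In this example n = 3 and e = 1, so n - e - 1 = 1. The recursion depth grows with the height of the tree, so very tall trees would need an explicit stack instead.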

Right rotation at a node of a rooted binary tree is defined as a simple
change to T as in Figure 1, taking the middle tree to the right-hand one. Left
rotation at a node is the natural inverse operation. The rotation distance
d_R(S, T) between two rooted binary trees S and T with the same number
of leaves is the minimum number of rotations needed to transform S to T.
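
To make these definitions concrete, here is a minimal sketch of the two rotations together with a brute-force computation of d_R by breadth-first search. The tuple encoding (a leaf is None, a node is a pair (left, right)) and all function names are assumptions made for this example. The search visits every tree reachable by rotations, so it is exponential in n and is meant only for checking very small cases; it is not the approximation algorithm of this note.

```python
from collections import deque

# Hypothetical encoding for this sketch: a leaf is None and an internal
# node is a pair (left, right).

def rotate_right(node):
    """((A, B), C) -> (A, (B, C)); the left child must be an internal node."""
    if node is None or node[0] is None:
        raise ValueError("right rotation needs an internal left child")
    (a, b), c = node
    return (a, (b, c))


def rotate_left(node):
    """(A, (B, C)) -> ((A, B), C); the right child must be an internal node."""
    if node is None or node[1] is None:
        raise ValueError("left rotation needs an internal right child")
    a, (b, c) = node
    return ((a, b), c)


def neighbors(tree):
    """Yield every tree obtained from `tree` by one rotation at some node."""
    if tree is None:
        return
    left, right = tree
    if left is not None:
        yield rotate_right(tree)
    if right is not None:
        yield rotate_left(tree)
    for new_left in neighbors(left):       # rotations inside the left subtree
        yield (new_left, right)
    for new_right in neighbors(right):     # rotations inside the right subtree
        yield (left, new_right)


def rotation_distance(s, t):
    """Exact rotation distance by breadth-first search (tiny trees only)."""
    seen = {s}
    queue = deque([(s, 0)])
    while queue:
        tree, dist = queue.popleft()
        if tree == t:
            return dist
        for nxt in neighbors(tree):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("trees do not have the same number of internal nodes")


S = ((None, None), (None, None))
T = (None, (None, (None, None)))           # right caterpillar
L = (((None, None), None), None)           # left caterpillar
print(rotate_right(S) == T)                # True
print(rotation_distance(S, T))             # 1
print(rotation_distance(L, T))             # 2
```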

The specific instance of the rotation distance problem we address is: 

Rotation Distance: 

Input: Two rooted ordered trees, S and T on n internal nodes, 
Question: Calculate the rotation distance between them, d_R(S, T).

Finding a sequence of rotations which accomplishes the transformation
gives only an upper bound. The general difficulty of computing rotation
distance comes from the lower bound.

3 Approximation Algorithm 

We first show that the rotation distance is bounded by the number of edges 
that differ between the trees. From this, the approximation result follows 
easily. 

Theorem 1 Let S and T be two distinct ordered rooted trees with the same 
number of leaves. Let n be the number of internal nodes and e the number 
of common edges for S and T. Then, 

n - e - 1 ≤ d_R(S, T) ≤ 2(n - e - 1)
Proof: The lower bound follows from two simple observations. First, if
we use a single rotation to transform T_1 to T_2, all but one of the internal
edges in each tree are common with the other tree. Second, every internal
edge of S that is not common with an internal edge of T needs a rotation
(possibly more than one) to transform it to an edge in common with T. Since a
tree with n internal nodes has n - 1 internal edges, the number of internal
edges occurring only in S is n - e - 1, which is therefore a simple lower
bound on the number of rotations.

For the upper bound, we use two facts from past work on rotation
distance. We first let (S_1, T_1), (S_2, T_2), ..., (S_{e+1}, T_{e+1}) be the
resulting tree pairs from removing the e edges S and T have in common, where
we insert placeholder leaves to preserve the extended binary tree property.
Let n_i be the size of tree S_i for i = 1, 2, ..., e + 1. The first fact is
the observation of Sleator et al. [12] used before: the rotation distance of
the original tree pair (S, T) with a common edge is the sum of the rotation
distances of the two tree pairs "above" and "below" the common edge. Extending
this to e edges in common between S and T, we have

\[
d_R(S, T) = \sum_{i=1}^{e+1} d_R(S_i, T_i) \le \sum_{i=1}^{e+1} (2n_i - 2) = 2n - 2(e + 1) = 2(n - e - 1).
\]

The second fact is the initial bound of 2n - 2 on rotation distance between
trees with n internal nodes of Culik and Wood [5], which gives the inequality
when applied to each pair (S_i, T_i). The equality that follows it uses
n_1 + n_2 + ... + n_{e+1} = n.

Thus, n - e - 1 ≤ d_R(S, T) ≤ 2(n - e - 1). □
We note that using the sharper bound of 2n - 6 for n ≥ 12 from Sleator,
Tarjan and Thurston [12], together with the table of distances for n < 12,
can improve this slightly further.

These reduction rules and counting the number of common edges can be
carried out in linear time [2, 6], yielding the corollary:

Corollary 2 Let S and T be ordered rooted trees with n internal nodes. A
2-approximation of their rotation distance can be calculated in linear time.

Proof: Let S and T be two distinct ordered rooted trees with n internal
nodes, and let e be the number of edges in common for S and T. Then, by
Theorem 1, n - e - 1 ≤ d_R(S, T) ≤ 2(n - e - 1). Since the upper bound is
exactly twice the lower bound, each bound is within a factor of 2 of
d_R(S, T), and both are determined by n and e, so we have the desired
approximation. □
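
The estimate in Corollary 2 can be sketched in a few lines. This is an illustrative version under stated assumptions rather than the authors' implementation: a leaf is encoded as None and a node as a pair (left, right), each internal edge is identified with the interval of leaf labels below it, and a hash set stands in for the linear-time common-edge computation cited above, so the running time is linear only in expectation. The names edge_intervals and rotation_distance_bounds are hypothetical.

```python
# Hypothetical encoding for this sketch: a leaf is None and an internal
# node is a pair (left, right). Leaves are labelled 1..n+1 left to right.

def edge_intervals(tree):
    """Return (intervals, n): leaf intervals of the internal edges and the size."""
    intervals = set()
    leaves_seen = 0
    stack = [(tree, None)]                   # (node, first leaf label or None)
    while stack:
        node, first = stack.pop()
        if node is None:                     # leaf: takes the next label
            leaves_seen += 1
        elif first is None:                  # first visit: schedule the children
            stack.append((node, leaves_seen + 1))
            stack.append((node[1], None))
            stack.append((node[0], None))    # left subtree is processed first
        else:                                # second visit: subtree is complete
            intervals.add((first, leaves_seen))
    intervals.discard((1, leaves_seen))      # the root has no parent edge
    return intervals, leaves_seen - 1        # n + 1 leaves means n internal nodes


def rotation_distance_bounds(s, t):
    """Return (lower, upper) = (n - e - 1, 2(n - e - 1)) as in Theorem 1."""
    s_edges, n = edge_intervals(s)
    t_edges, n_t = edge_intervals(t)
    if n != n_t:
        raise ValueError("trees must have the same number of internal nodes")
    e = len(s_edges & t_edges)               # number of common edges
    lower = max(n - e - 1, 0)                # clamp the degenerate case n = 0
    return lower, 2 * lower


S = ((None, None), (None, None))
T = (None, (None, (None, None)))             # right caterpillar
L = (((None, None), None), None)             # left caterpillar
print(rotation_distance_bounds(S, T))        # (1, 2)
print(rotation_distance_bounds(L, T))        # (2, 4)
print(rotation_distance_bounds(S, S))        # (0, 0)
```

Either value can be reported as the estimate: the lower bound never overestimates d_R(S, T) and the upper bound never underestimates it, and each is within a factor of 2 of the true distance.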

We note that this algorithm not only approximates rotation distance,
it also gives a sequence of rotations which realizes the upper bound of the
approximation, again in linear time. The approximation algorithm uses the
Culik-Wood bound on potentially several pieces. On each piece, the 2n - 2
bound comes from rotating each internal node which is not on the right side
of the tree to obtain a right caterpillar, and then rotating the caterpillar to
obtain the desired tree. This can be accomplished simply in linear time.
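
The caterpillar step on a single piece can be sketched as follows. This is a minimal illustration under stated assumptions: a leaf is encoded as None and a node as a pair (left, right), the function names are hypothetical, and the code works on whole trees, leaving out the cutting of S and T along their common edges. Each right rotation moves one more node onto the right side of the tree, so a tree of size n needs at most n - 1 rotations.

```python
# Hypothetical encoding for this sketch: a leaf is None and an internal
# node is a pair (left, right).

def to_right_caterpillar(tree):
    """Rotate `tree` into a right caterpillar.

    Returns (caterpillar, depths), where depths[k] is the depth on the right
    side of the tree at which the k-th right rotation was applied.
    """
    depths = []
    depth = 0
    node = tree
    while node is not None:
        left, right = node
        if left is not None:
            a, b = left
            node = (a, (b, right))       # right rotation at the current node
            depths.append(depth)
        else:
            node = right                 # left child is a leaf: move down
            depth += 1
    caterpillar = None
    for _ in range(depth):               # depth now equals the size n
        caterpillar = (None, caterpillar)
    return caterpillar, depths


def caterpillar_path_length(s, t):
    """Rotations used going s -> caterpillar -> t, at most 2n - 2 in total."""
    cat_s, rot_s = to_right_caterpillar(s)
    cat_t, rot_t = to_right_caterpillar(t)
    if cat_s != cat_t:
        raise ValueError("trees must have the same number of internal nodes")
    # The second half of the path undoes rot_t in reverse, using left rotations.
    return len(rot_s) + len(rot_t)


S = ((None, None), (None, None))
T = (None, (None, (None, None)))         # right caterpillar
L = (((None, None), None), None)         # left caterpillar
print(to_right_caterpillar(L))           # ((None, (None, (None, None))), [0, 0])
print(to_right_caterpillar(S)[1])        # [0]
print(caterpillar_path_length(L, S))     # 3
```

The loop runs once per rotation and once per node on the right side of the final caterpillar, so the number of steps is linear in n.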